118 research outputs found

    Bolt: Accelerated Data Mining with Fast Vector Compression

    Vectors of data are at the heart of machine learning and data mining. Recently, vector quantization methods have shown great promise in reducing both the time and space costs of operating on vectors. We introduce a vector quantization algorithm that can compress vectors over 12x faster than existing techniques while also accelerating approximate vector operations such as distance and dot product computations by up to 10x. Because it can encode over 2GB of vectors per second, it makes vector quantization cheap enough to employ in many more circumstances. For example, using our technique to compute approximate dot products in a nested loop can multiply matrices faster than a state-of-the-art BLAS implementation, even when our algorithm must first compress the matrices. In addition to showing the above speedups, we demonstrate that our approach can accelerate nearest neighbor search and maximum inner product search by over 100x compared to floating point operations and up to 10x compared to other vector quantization methods. Our approximate Euclidean distance and dot product computations are not only faster than those of related algorithms with slower encodings, but also faster than Hamming distance computations, which have direct hardware support on the tested platforms. We also assess the errors of our algorithm's approximate distances and dot products, and find that it is competitive with existing, slower vector quantization algorithms. Comment: Research track paper at KDD 201

    Special Student Project: Developments under the Surface Mining Control and Reclamation Act of 1977

    The Surface Mining Control and Reclamation Act of 1977 (SMCRA) is one of the most significant enactments ever to affect the coal mining industry. In pervasive fashion, it is intended to control virtually every environmental aspect of surface mining as well as all surface effects of underground coal mining. The responsibility for establishing a regulatory program to refine and implement the Act is vested in the United States Department of the Interior. However, as individual regulatory plans are submitted by the states and approved by the Secretary of the Interior, the Act provides for an assumption by the states of primary regulatory authority over mining activities conducted within their borders. As of mid-1980, no state except Texas had assumed primary regulatory authority. Proposed amendments to the SMCRA, changes and uncertainties in the model regulatory program as promulgated by the Office of Surface Mining Reclamation and Enforcement (OSM), and challenges to OSM's authority to regulate certain aspects of coal mining have all contributed to the delay in the states' assumption of primary regulatory authority. This Project is intended to note the significant changes and challenges to the SMCRA and to the regulations promulgated thereunder over the period beginning with the issuance of the permanent regulatory program until the present time

    A review of the polygraph: history, methodology and current status

    The history of research into psychophysiological measurements as an aid to detecting lying, widely known as the ‘lie detector’ or polygraph, is the focus of this review. The physiological measurements used are detailed and the debates about the polygraph's role in the investigative process are introduced. Attention is given to the main polygraph testing methods, namely the Comparative Question Test and the Concealed Information Test. Discussion of these two central methods, their uses and problems forms the basis of the review. Recommendations for future research are made, specifically with regard to improving current polygraph technology and exploring the role of the polygraph in combination with other deception detection techniques

    Engaging diverse underserved communities to bridge the mammography divide

    Background: Breast cancer screening continues to be underutilized by the population in general, but is particularly underutilized by traditionally underserved minority populations. Two of the most at risk female minority groups are American Indians/Alaska Natives (AI/AN) and Latinas. American Indian women have the poorest recorded 5-year cancer survival rates of any ethnic group while breast cancer is the number one cause of cancer mortality among Latina women. Breast cancer screening rates for both minority groups are near or at the lowest among all racial/ethnic groups. As with other health screening behaviors, women may intend to get a mammogram but their intentions may not result in initiation or follow through of the examination process. An accumulating body of research, however, demonstrates the efficacy of developing 'implementation intentions' that define when, where, and how a specific behavior will be performed. The formulation of intended steps in addition to addressing potential barriers to test completion can increase a person's self-efficacy, operationalize and strengthen their intention to act, and close gaps between behavioral intention and completion. To date, an evaluation of the formulation of implementation intentions for breast cancer screening has not been conducted with minority populations. Methods/Design: In the proposed program, community health workers (CHWs) will meet with rural-dwelling Latina and American Indian women one-on-one to educate them about breast cancer and screening and guide them through a computerized and culturally tailored "implementation intentions" program, called Healthy Living Kansas - Breast Health, to promote breast cancer screening utilization. We will target Latina and AI/AN women from two distinct rural Kansas communities. Women attending community events will be invited by CHWs to participate and be randomized to either a mammography "implementation intentions" (MI²) intervention or a comparison general breast cancer prevention informational intervention (C). CHWs will be armed with notebook computers loaded with our Healthy Living Kansas - Breast Health program and guide their peers through the program. Women in the MI² condition will receive assistance with operationalizing their screening intentions and identifying and addressing their stated screening barriers with the goal of guiding them toward accessing screening services near their community. Outcomes will be evaluated at 120-days post randomization via self-report and will include mammography utilization status, barriers, and movement along a behavioral stages of readiness to screen model. Discussion: This highly innovative project will be guided and initiated by AI/AN and Latina community members and will test the practical application of emerging behavioral theory among minority persons living in rural communities. Trial Registration: ClinicalTrials (NCT): NCT01267110 (http://www.clinicaltrials.gov/ct2/show/NCT01267110)

    Definitions and methods of measuring and reporting on injurious falls in randomised controlled fall prevention trials: a systematic review

    Background: The standardisation of the assessment methodology and case definition represents a major precondition for the comparison of study results and the conduct of meta-analyses. International guidelines provide recommendations for the standardisation of falls methodology; however, injurious falls have not been targeted. The aim of the present article was to review systematically the range of case definitions and methods used to measure and report on injurious falls in randomised controlled trials (RCTs) on fall prevention. Methods: An electronic literature search of selected comprehensive databases was performed to identify injurious falls definitions in published trials. Inclusion criteria were: RCTs on falls prevention published in English, study population ≥ 65 years, definition of injurious falls as a study endpoint by using the terms "injuries" and "falls". Results: The search yielded 2089 articles, of which 2048 were excluded according to the defined inclusion criteria. Forty-one articles were included. The systematic analysis of the methodology applied in RCTs disclosed substantial variations in the definition and methods used to measure and document injurious falls. The limited standardisation hampered comparability of study results. Our results also highlight that studies which used a similar, standardised definition of injurious falls showed comparable outcomes. Conclusions: No standard for defining, measuring, and documenting injurious falls could be identified among published RCTs. A standardised injurious falls definition enhances the comparability of study results, as demonstrated by a subgroup of RCTs that used a similar definition. Recommendations for standardising the methodology are given in the present review

    Feature Flocks : accurate pattern discovery in multivariate signals

    Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016. Cataloged from PDF version of thesis. Includes bibliographical references (pages 81-85). Thanks to the rise of wearable and connected devices, sensor-generated time series comprise a large and growing fraction of the world's data. Unfortunately, extracting value from this data can be challenging, since sensors can only report low-level signals (e.g., acceleration), not the high-level phenomena that are typically of interest (e.g., gestures). We introduce a technique to bridge this gap by automatically learning to identify real-world events in low-level data with no human labeling. By identifying "flocks" of features that repeat in the same temporal arrangement, we learn to recognize such diverse phenomena as human actions, power consumption patterns, and spoken words with up to 96% precision and recall. Our method is fast enough to run in real time and assumes only minimal knowledge of which variables are relevant or how long patterns are. Our evaluation uses numerous publicly available datasets and over 1 million samples of sensor data in which we manually labeled ground truth. by Davis W. Blalock. S.M

    EXTRACT: Strong Examples from Weakly-Labeled Sensor Data

    © 2016 IEEE. Thanks to the rise of wearable and connected devices, sensor-generated time series comprise a large and growing fraction of the world's data. Unfortunately, extracting value from this data can be challenging, since sensors report low-level signals (e.g., acceleration), not the high-level events that are typically of interest (e.g., gestures). We introduce a technique to bridge this gap by automatically extracting examples of real-world events in low-level data, given only a rough estimate of when these events have taken place. By identifying sets of features that repeat in the same temporal arrangement, we isolate examples of such diverse events as human actions, power consumption patterns, and spoken words with up to 96% precision and recall. Our method is fast enough to run in real time and assumes only minimal knowledge of which variables are relevant or the lengths of events. Our evaluation uses numerous publicly available datasets and over 1 million samples of manually labeled sensor data

    Bolt: Accelerated Data Mining with Fast Vector Compression

    © 2017 Copyright held by the owner/author(s). Vectors of data are at the heart of machine learning and data mining. Recently, vector quantization methods have shown great promise in reducing both the time and space costs of operating on vectors. We introduce a vector quantization algorithm that can compress vectors over 12x faster than existing techniques while also accelerating approximate vector operations such as distance and dot product computations by up to 10x. Because it can encode over 2GB of vectors per second, it makes vector quantization cheap enough to employ in many more circumstances. For example, using our technique to compute approximate dot products in a nested loop can multiply matrices faster than a state-of-the-art BLAS implementation, even when our algorithm must first compress the matrices. In addition to showing the above speedups, we demonstrate that our approach can accelerate nearest neighbor search and maximum inner product search by over 100x compared to floating point operations and up to 10x compared to other vector quantization methods. Our approximate Euclidean distance and dot product computations are not only faster than those of related algorithms with slower encodings, but also faster than Hamming distance computations, which have direct hardware support on the tested platforms. We also assess the errors of our algorithm's approximate distances and dot products, and find that it is competitive with existing, slower vector quantization algorithms

    Actinic lichen nitidus

    We present the case of a 29-year-old black female with an initial clinical and histopathologic diagnosis of actinic lichen nitidus. Three years later, she presented with scattered hyperpigmented macules with oval pink/violaceous plaques bilaterally on her forearms and on her neck, clinically consistent with actinic lichen planus. She was treated with topical steroids at each visit, with subsequent resolution of her lesions. In this report, we discuss the spectrum of actinic lichenoid dermatoses and of disease that presents even in the same patient. Case Repor